docs: add tutorial for securing internal traffic with SPIRE (mTLS)#364

Open
mahil-2040 wants to merge 4 commits into
volcano-sh:mainfrom
mahil-2040:docs/add-spire-auth-guide

@mahil-2040
Contributor

What type of PR is this?

/kind documentation

What this PR does / why we need it:

Description

This PR adds a comprehensive user guide (docs/tutorials/internal-auth-spire.md) for the newly implemented SPIRE mTLS internal authentication system.

The tutorial explains the architecture of our zero-trust control plane and acts as a definitive reference for developers and operators.

Key Additions

  • Architecture Overview: Explains how the Router and WorkloadManager cryptographically verify each other's SPIFFE identities, and how the CertWatcher achieves zero-downtime certificate rotation via fsnotify.
  • Configuration Guide: Documents the exact CLI flags required to enable mTLS on both components (--mtls-* for Router, --tls-* for WorkloadManager).
  • Sidecar Mechanics: Details how the spiffe-helper sidecar is deployed alongside the control plane pods to automatically provision and rotate SVIDs.
  • Architectural Justification: Adds a section explicitly explaining why PicoD and AgentRuntime sandboxes continue to use JWT authentication (avoiding TLS handshake overhead to preserve ultra-low cold-start latency, and keeping user-defined runtimes pure).
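The identity check described under Architecture Overview can be illustrated with a minimal sketch. It is not taken from the AgentCube codebase: the trust domain `example.org`, the ID paths, and the names `ALLOWED_PEERS`, `parse_spiffe_id`, and `is_authorized` are all hypothetical, and a real deployment would read the SPIFFE ID from the URI SAN of the peer certificate after the TLS handshake has already validated the chain.

```python
from urllib.parse import urlsplit

# Hypothetical allow-list; the trust domain and paths are illustrative only.
ALLOWED_PEERS = {
    "spiffe://example.org/ns/agentcube-system/sa/router",
    "spiffe://example.org/ns/agentcube-system/sa/workload-manager",
}


def parse_spiffe_id(uri):
    """Split a SPIFFE ID into (trust_domain, path); raise ValueError if malformed."""
    parts = urlsplit(uri)
    if parts.scheme != "spiffe":
        raise ValueError("scheme must be spiffe")
    # The trust domain is the URI authority; user info and ports are not allowed.
    if not parts.netloc or "@" in parts.netloc or ":" in parts.netloc:
        raise ValueError("invalid trust domain")
    # SPIFFE IDs carry no query string or fragment.
    if parts.query or parts.fragment:
        raise ValueError("query and fragment are not allowed")
    return parts.netloc, parts.path


def is_authorized(peer_uri, allowed=ALLOWED_PEERS):
    """Accept a peer only if its ID is well formed and exactly matches an allowed ID."""
    try:
        parse_spiffe_id(peer_uri)
    except ValueError:
        return False
    return peer_uri in allowed
```

Exact matching against an allow-list is the simplest policy; matching on trust domain alone would admit every workload in that domain, which is usually broader than intended for a control plane.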

Which issue(s) this PR fixes:
Fixes #

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

NO

NONE

Copilot AI review requested due to automatic review settings May 27, 2026 17:44
@volcano-sh-bot volcano-sh-bot added the kind/documentation Improvements or additions to documentation label May 27, 2026
Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request adds a new tutorial documentation file explaining how SPIRE is used to secure internal traffic with mTLS in AgentCube. The reviewer pointed out two critical discrepancies between the documentation and the actual codebase: the documentation references a CertWatcher and hot-reloading mechanism that does not exist in the code, and it lists incorrect CLI flags (--mtls-* and --tls-ca) that are not supported by the current implementation.

@codecov-commenter

codecov-commenter commented May 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 57.90%. Comparing base (524e55e) to head (bd05fe5).
⚠️ Report is 110 commits behind head on main.

Additional details and impacted files
@@             Coverage Diff             @@
##             main     #364       +/-   ##
===========================================
+ Coverage   47.57%   57.90%   +10.33%     
===========================================
  Files          30       34        +4     
  Lines        2819     3181      +362     
===========================================
+ Hits         1341     1842      +501     
+ Misses       1338     1154      -184     
- Partials      140      185       +45     
Flag Coverage Δ
unittests 57.90% <100.00%> (+10.33%) ⬆️

Member

@hzxuzhonghu hzxuzhonghu left a comment

This is not a good tutorial. Think about it as a new user: could you run it by following this guide?

@mahil-2040 mahil-2040 force-pushed the docs/add-spire-auth-guide branch from 625d1fd to 8c07a2f Compare May 28, 2026 08:14
@mahil-2040
Contributor Author

This is not a good tutorial. Think about it as a new user: could you run it by following this guide?

I have completely rewritten the guide with step-by-step instructions, commands, and the expected output of each step. PTAL!

@mahil-2040 mahil-2040 requested a review from hzxuzhonghu May 28, 2026 08:17
@mahil-2040 mahil-2040 requested a review from acsoto May 31, 2026 05:26
Signed-off-by: Mahil Patel <mahilpatel0808@gmail.com>
Signed-off-by: Mahil Patel <mahilpatel0808@gmail.com>
- Added klog.Infof to wait.go so that the logs actually contain the output the tutorial lists as expected
- Helm upgrade removes SPIRE workloads and sidecars, not CRDs (those are removed separately via kubectl)
- Added the --reuse-values flag to preserve the install-time values

Signed-off-by: Mahil Patel <mahilpatel0808@gmail.com>
- Replaced generalized output placeholders (xxxxx pod hashes, ... UUIDs, and XX-XX timestamps) with actual outputs to prevent ambiguity.
- Updated expected log outputs for Router and WorkloadManager to accurately reflect the format emitted by the codebase.
- Fixed agentcube-system namespace inconsistencies across the documentation to align with the core getting-started guide.

Signed-off-by: Mahil Patel <mahilpatel0808@gmail.com>
@mahil-2040 mahil-2040 force-pushed the docs/add-spire-auth-guide branch from a145259 to bd05fe5 Compare June 1, 2026 17:48
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign kevin-wangzefeng, yaozengzeng for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Comment on lines +165 to +166
agentcube-router-574d98b76-tr2nr 2/2 Running 5 (2m24s ago) 3m17s
spire-agent-8r9jx 1/1 Running 3 (2m44s ago) 3m17s
Member

what's the cause of restart?

Member

Can you help file an issue

Member

@hzxuzhonghu hzxuzhonghu left a comment

For

podTemplate:
  spec:
    containers:
      - name: agent
Member

where is this image?

Labels

kind/documentation Improvements or additions to documentation size/L

5 participants